Fix for #158 by using normpath #169
Conversation
Thanks @adegomme. Does this also fix the other problems? I tried checking out this branch, upgrading to
Edit: note, the above is with the
Why does the
I was able to run the fast one without failure this weekend, on the Docker version of the QM. I did not have enough memory for the moderate one, but I was able to launch it anyway. I think the first error is due to a crash during the BigDFT run.
So this error means that no output was correctly generated (the logfile couldn't be retrieved/parsed)? It's a bit odd that the second one goes further and does not report a failure at the calculation level if the first one failed this way.
I tried to rerun, but now it even blows up my machine. I am pretty sure it is due to the heavy memory usage, as you said. I have 16 GB, but my machine stalls almost instantly and I have to hard-reboot it. Is it normal for BigDFT to use such a huge amount of memory for a small silicon system? Not sure how to test this now... @bosonie, this, together with the long walltimes required for CP2K, suggests we have to come up with some kind of system to warn users about resource requirements. If they try it in the QM and it fails like it does now, they are going to assume there is a problem with the code, whereas it is simply a lack of resources. I think we should definitely update the SI and documentation to give these required resource estimates, because otherwise it is going to cause problems, I am sure.
@sphuber this has always been my concern and a clear warning is in the SI of the paper. However, it is just a generic warning, not a specific one. We need to systematically gather data in order to be more precise.
For Si, since we had to make it larger to run it for now (8 atoms instead of two), it's quite specific. Al should be lighter and take much less time to run as well. But yes, this is too heavy for such small cases; I will look into removing some costly options in fast mode...
They had all been using the settings for the precise protocol for the last few versions, leading to long computation times and high memory usage.
The k-point values had also been computed incorrectly for the last few versions, resulting in too many k-points being used for the fast and moderate protocols. The new values use much less memory and time.
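For context on why a k-point bug blows up cost: a mesh is typically derived from a target k-point spacing along each reciprocal lattice vector, so an error in that computation directly inflates the number of k-points. A generic sketch of the usual recipe (this is not the plugin's actual code; the function name and the 0.3 spacing are illustrative):

```python
import math

def kpoint_mesh(reciprocal_lengths, kpt_distance):
    """Return a mesh such that the spacing along each reciprocal lattice
    vector is at most ``kpt_distance`` (e.g. in 1/angstrom).

    ``reciprocal_lengths`` are the norms |b_i| of the reciprocal vectors.
    Overestimating them (or using too small a spacing) inflates the mesh
    and hence memory and runtime, which matches the symptom above.
    """
    return [max(1, math.ceil(length / kpt_distance))
            for length in reciprocal_lengths]

# Illustration: a cubic cell with |b_i| ~ 1.157 1/angstrom and a
# 0.3 1/angstrom target spacing yields a 4x4x4 mesh.
mesh = kpoint_mesh([1.157, 1.157, 1.157], 0.3)
```

The cost of a DFT run grows with the product of the mesh dimensions, so even one extra division per axis is a large multiplier for the fast and moderate protocols.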
Thanks for the updates @adegomme . I tried again and now the fast Si relax works without problems and in a reasonable amount of time. However, the moderate still fails for the same reason. The calcjob and base workchain finish successfully, but the relax workchain stops with a
and the report
Looking at the outputs of the
but then again, that is also not the case for the run with the fast protocol, which worked just fine. So I think the

Actually, I figured it out. While preparing the output files to upload them here, I noticed that the job was killed by the scheduler. The problem is that the parser doesn't check this and simply happily returns a
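The kind of guard being asked for here can be sketched generically: before reporting success, the parser should verify that output was actually retrieved and that the run reached completion. This is only a hypothetical sketch with made-up names and an illustrative completion marker, not the actual aiida-bigdft parser API:

```python
def parse_logfile(logfile_text):
    """Hypothetical parser guard: refuse to report success when the
    output is missing or the run was cut short, e.g. because the
    scheduler killed the job (out of memory, walltime exceeded)."""
    if logfile_text is None or not logfile_text.strip():
        # Nothing was retrieved at all: the logfile is missing.
        return ("ERROR_MISSING_OUTPUT", None)
    if "END_OF_RUN" not in logfile_text:  # illustrative end marker
        # The logfile exists but lacks its final marker, so the job
        # likely died mid-run; do not hand back a success exit code.
        return ("ERROR_INCOMPLETE_RUN", None)
    return ("SUCCESS", logfile_text)
```

With a check like this, a scheduler kill surfaces as a parse-level failure instead of a silently "successful" calculation whose missing results only break a later step of the workchain.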
@adegomme I have opened three issues on
Final wrap-up: I have tested this branch on
Fixes #158
There is also a fix for a related issue published in aiida-bigdft, so the version requirement was bumped.
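For reference on the fix named in the PR title: `os.path.normpath` from the Python standard library collapses redundant separators and `.`/`..` components, so two different spellings of the same path compare equal after normalization. A minimal illustration (the example paths are made up, not taken from the plugin):

```python
import os.path

# normpath collapses "./", doubled separators, and "run/.." so that
# equivalent paths become string-equal and path comparisons succeed.
a = os.path.normpath("./data//run/../run/log.yaml")
b = os.path.normpath("data/run/log.yaml")
assert a == b == os.path.join("data", "run", "log.yaml")
```

Note that `normpath` is purely lexical: it does not touch the filesystem or resolve symlinks (that would be `os.path.realpath`), which makes it a cheap way to canonicalize paths before comparing them.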